Kenneth Tay
Oct 23, 2018
maps package
The maps package contains many outlines of continents, countries, states, and counties.
ggplot2's map_data function puts these outlines in data frame format, which then allows us to plot them with ggplot.
library(ggplot2)
library(maps)
library(dplyr)
library(readr)
county_data <- map_data("county")
CA_data <- county_data %>% filter(region == "california")
head(CA_data)
## long lat group order region subregion
## 1 -121.4785 37.48290 157 6965 california alameda
## 2 -121.5129 37.48290 157 6966 california alameda
## 3 -121.8853 37.48290 157 6967 california alameda
## 4 -121.8968 37.46571 157 6968 california alameda
## 5 -121.9254 37.45998 157 6969 california alameda
## 6 -121.9483 37.47717 157 6970 california alameda
County outlines are drawn using geom_polygon.
ggplot(data = CA_data) +
  geom_polygon(mapping = aes(x = long, y = lat, group = group))
coord_quickmap() preserves the aspect ratio of the map.
ggplot(data = CA_data) +
  geom_polygon(mapping = aes(x = long, y = lat, group = group)) +
  coord_quickmap()
drought_data <- read_csv("Drought for Session 7.csv")
head(drought_data, n = 3)
## # A tibble: 3 x 2
## County Drought_percent
## <chr> <dbl>
## 1 alameda 100
## 2 alpine 100
## 3 amador 100
head(CA_data, n = 3)
## long lat group order region subregion
## 1 -121.4785 37.4829 157 6965 california alameda
## 2 -121.5129 37.4829 157 6966 california alameda
## 3 -121.8853 37.4829 157 6967 california alameda
Our drought data and mapping information are in different datasets!
Solution: Use joins (in the dplyr package).
combined_data <- CA_data %>%
  left_join(drought_data, by = c("subregion" = "County"))
head(combined_data)
## long lat group order region subregion Drought_percent
## 1 -121.4785 37.48290 157 6965 california alameda 100
## 2 -121.5129 37.48290 157 6966 california alameda 100
## 3 -121.8853 37.48290 157 6967 california alameda 100
## 4 -121.8968 37.46571 157 6968 california alameda 100
## 5 -121.9254 37.45998 157 6969 california alameda 100
## 6 -121.9483 37.47717 157 6970 california alameda 100
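A left join keeps every row of the map data, so a county whose name does not match exactly (for example, because of different capitalization) ends up with a missing Drought_percent instead of raising an error. One way to check for this is an anti join, sketched below. The data frames map_toy and drought_toy are made-up stand-ins for CA_data and drought_data; the names and values are illustrative assumptions, not the real data.

```r
library(dplyr)

# Hypothetical stand-ins for CA_data and drought_data (illustrative values only)
map_toy <- tibble(
  subregion = c("alameda", "alameda", "alpine", "san francisco"),
  long = c(-121.48, -121.51, -119.90, -122.42),
  lat = c(37.48, 37.48, 38.60, 37.77)
)
drought_toy <- tibble(
  County = c("alameda", "alpine", "San Francisco"),  # note the capitalization
  Drought_percent = c(100, 100, 100)
)

# Map rows whose county name has no exact match in the drought data
unmatched <- map_toy %>%
  anti_join(drought_toy, by = c("subregion" = "County")) %>%
  distinct(subregion)
unmatched
```

If unmatched has any rows, those counties would be drawn with a missing fill value after the join, so the names are worth cleaning up first.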
Map the fill attribute of geom_polygon to the Drought_percent column.
ggplot(data = combined_data) +
  geom_polygon(mapping = aes(x = long, y = lat,
                             group = group, fill = Drought_percent)) +
  coord_quickmap()
Use scale_fill_distiller to define a more appropriate color scale.
ggplot(data = combined_data) +
  geom_polygon(mapping = aes(x = long, y = lat,
                             group = group, fill = Drought_percent)) +
  scale_fill_distiller(palette = "YlOrRd", direction = 1) +
  coord_quickmap()
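The same scale can also control how missing values are drawn and how the legend is titled. The sketch below uses two made-up square "counties" (the toy data frame is an assumption for illustration, not real map data), one of which has no drought value. The na.value argument of scale_fill_distiller sets the fill for missing values, and labs(fill = ...) sets the legend title.

```r
library(ggplot2)

# Two hypothetical square "counties"; the second has no drought value
toy <- data.frame(
  long = c(0, 1, 1, 0, 1, 2, 2, 1),
  lat = c(0, 0, 1, 1, 0, 0, 1, 1),
  group = rep(c(1, 2), each = 4),
  Drought_percent = rep(c(80, NA), each = 4)
)

p <- ggplot(data = toy) +
  geom_polygon(mapping = aes(x = long, y = lat,
                             group = group, fill = Drought_percent)) +
  scale_fill_distiller(palette = "YlOrRd", direction = 1,
                       na.value = "grey80") +  # fill for missing values
  labs(fill = "Drought (%)") +                 # legend title
  coord_quickmap()
p
```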
Optional material
Inner join: Matches pairs of observations with equal keys, drops everything else. Hence, only keeps observations which appear in both datasets.
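As a small illustration of the difference between an inner join and the left join used earlier, the sketch below joins two made-up tables, x and y (hypothetical names and values), that share only some keys.

```r
library(dplyr)

# Hypothetical tables: keys 1 and 2 appear in both, 3 only in x, 4 only in y
x <- tibble(key = c(1, 2, 3), val_x = c("x1", "x2", "x3"))
y <- tibble(key = c(1, 2, 4), val_y = c("y1", "y2", "y4"))

# Inner join: keeps only keys present in both tables (1 and 2)
inner <- inner_join(x, y, by = "key")

# Left join: keeps every row of x; val_y is NA where y has no match (key 3)
left <- left_join(x, y, by = "key")

inner
left
```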
After matching pairs of observations with equal keys…